Semantic Lexicons: The Cornerstone for Lexical Choice in Natural Language Generation

Authors

  • Evelyne Viegas
  • Pierrette Bouillon
Abstract

In this paper, we address the issue of integrating semantic lexicons into NLG systems and argue that the problem of lexical choice in generation can be approached only by such an integration. We take the approach of Generative Lexicon Theory (GLT) (Pustejovsky, 1991, 1994c), which provides a system involving four levels of representation connected by a set of generative devices accounting for a compositional interpretation of words in context. We are interested in showing that we can reduce the set of collocations listed in the lexicon by introducing the notion of "semantic collocations", which can be predicted within the GLT framework. We argue that the lack of well-defined semantic calculi in previous approaches, whether linguistic or conceptual, renders them unable to account for semantic collocations.

1 Introduction

Whether we talk of monolingual or multilingual generation, it is not surprising that there has been very little focus on the area of lexical choice. Lexical choice has often been side-stepped, not because it is a daunting issue, but rather because the interest in natural language generation (NLG) first focused on syntactic, morphological and discourse aspects of language. Semantic accuracy has therefore been sacrificed in the production of fluent grammatical sentences. In section 2, we highlight the issue of lexical choice by arguing that generation systems must integrate lexical semantics, focusing on the treatment of Adjective-Noun (Adj-Noun) collocations. We introduce the notion of "semantic collocations", which allows us to reduce the set of collocations that are usually listed in lexicons. In section 3, we present relevant aspects of the Generative Lexicon Theory (GLT), which, we argue, provides a better representation and interpretation of lexical information, enabling us to generate the set of possible semantic collocations in a predictive way without listing them in lexical entries.
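The idea of predicting Adj-Noun semantic collocations from lexical structure, rather than listing each pair, can be illustrated with a small sketch. It is not the authors' implementation. It assumes that each noun entry records a telic role (the typical activity associated with the noun, one of the qualia roles in GLT) and that certain adjectives modify that activity. The names `NounEntry`, `NOUNS`, `ADJ_GLOSSES` and `interpret`, as well as the toy entries and glosses, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NounEntry:
    """Toy lexical entry; only the telic role is modeled (assumption)."""
    lemma: str
    telic: Optional[str] = None  # typical activity associated with the noun


# Hypothetical mini-lexicon: no Adj-Noun pair is listed anywhere.
NOUNS = {
    "book": NounEntry("book", telic="read"),
    "novel": NounEntry("novel", telic="read"),
    "car": NounEntry("car", telic="drive"),
    "rock": NounEntry("rock"),  # no telic role recorded
}

# Hypothetical gloss templates for adjectives that modify the telic event.
ADJ_GLOSSES = {
    "long": "{noun} that takes long to {event}",
    "easy": "{noun} that is easy to {event}",
    "fast": "{noun} that one can {event} fast",
}


def interpret(adj: str, noun: str) -> Optional[str]:
    """Return a gloss if the pair is predicted from the noun's telic role."""
    entry = NOUNS.get(noun)
    template = ADJ_GLOSSES.get(adj)
    if entry is None or template is None or entry.telic is None:
        return None  # not predicted by this toy model
    return template.format(noun=entry.lemma, event=entry.telic)


if __name__ == "__main__":
    for adj, noun in [("long", "book"), ("easy", "novel"),
                      ("fast", "car"), ("fast", "rock")]:
        print(adj, noun, "->", interpret(adj, noun))
```

In this sketch the three collocations cited in the text are derived compositionally, while a pair such as "fast rock" is simply not predicted. Idiosyncratic pairs such as "a large coke" vs. "a big coke" fall outside what such a calculation can capture and would still have to be listed.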
GLT is still under development from a theoretical point of view, and up to now no generation system (as far as the authors are aware) has tried to integrate or implement its ideas. We propose to do so, and are currently studying its theoretical adequacy for generation with special reference to the issue of lexical choice. In section 4, we show that it is possible to calculate Adj-Noun semantic collocations (a long book; an easy novel; a fast car) as opposed to the type of collocations where idiosyncrasy seems to be involved (a large coke vs. a big coke). Finally, in section 5, we emphasize the adequacy of a framework such as GLT to generate the possible set of semantic collocations.

We would like to thank Susan Armstrong, Paul Buitelaar, Federica Busa, Dominique Estival, James Pustejovsky, Graham Russell and Scott Waterman for their helpful comments.

2 The Issue of Lexical Choice

There is a debate in NLG concerning the place of lexical choice in the generation process. Should lexical choice take place at the level of the "planning component" or the "realization component"? Even for generators which do not have a "traditional" two-component architecture, actions are still sequential and lexical choice takes place after some "planning". Lexical choice relates to lexicalization in the sense of not only needing to pick up the right words or expressions but also of needing to "realize" them or lexicalize them. We would argue on the one hand that lexicalization does not constitute an autonomous module within the process of generation, and on the other hand that lexical choice is not the sole prerogative of either the "planning" or the "realization" component.
The reason is that a concept cannot be seen in isolation (the choice of a particular concept will trigger some other related concepts) and, when lexicalized, the syntactico-semantics of the lexical item will impose some constraints on the further possible choice of concepts to be lexicalized (thus constraining the set of concepts triggered by the previous one). In other words, in the process of production a lexical choice can influence a conceptual choice and vice versa. Thus in terms of NLG this means that lexical choice has some influence at the level of "planning" and "re-

Similar resources

Combining Multiple, Large-Scale Resources in a Reusable Lexicon for Natural Language Generation

A lexicon is an essential component in a generation system but few efforts have been made to build a rich, large-scale lexicon and make it reusable for different generation applications. In this paper, we describe our work to build such a lexicon by combining multiple, heterogeneous linguistic resources which have been developed for other purposes. Novel transformation and integration of resour...

Multilingual Computational Semantic Lexicons in Action: The WYSINNWYG Approach to NLP

Much effort has been put into computational lexicons over the years, and most systems give much room to (lexical) semantic data. However, in these systems, the effort put on the study and representation of lexical items to express the underlying continuum existing in 1) language vagueness and polysemy, and 2) language gaps and mismatches, has remained embryonic. A sense enumeration approach fai...

Lexical Coverage Evaluation of Large-scale Multilingual Semantic Lexicons for Twelve Languages

The last two decades have seen the development of various semantic lexical resources such as WordNet (Miller, 1995) and the USAS semantic lexicon (Rayson et al., 2004), which have played an important role in the areas of natural language processing and corpus-based studies. Recently, increasing efforts have been devoted to extending the semantic frameworks of existing lexical knowledge resource...

Automatic Generation Of Multiple Choice Questions From Domain Ontologies

The aim of this paper is to present an innovative approach for generating multiple choice questions in automatic way. Although other approaches have been already reported in the literature, the approach presented in this paper is based on domain specific ontologies and it is independent of lexicons such as WordNet or other linguistic resources. The paper also reports on a first prototype implem...

Towards Best Practice for Multiword Expressions in Computational Lexicons

The importance and role of multi-word expressions (MWE) in the description and processing of natural language has been long recognized. However, multi-word information has often been relegated to the marginal role of idiosyncratic lexical information. The need for MWE lexicons grows even more acute for multi-lingual applications, for which (sometimes complex) correspondences must be identified,...

Publication date: 1994